# **Topic 8**

## **Control Hazards**

# **Control (Branch) Hazards**

Current implementation

Determination for branch in MEM stage



## **Control Hazards**

- Branch determines flow of control
  - Fetching next instruction depends on branch outcome
  - Pipeline can't always fetch correct instruction
    - When we need to fetch the next instruction, branch is still in the ID stage
    - Target address calculation and register comparison of branch are in the EX stage
    - Branch decision is made in the MEM stage

### **Branch Hazards**

#### Branch outcome determined in MEM





## Resolutions to Branch Hazard

- Stall on branch
- Always assume branch not taken
- Branch prediction

#### Stall on Branch

 Wait until branch outcome is determined before fetching the next instruction



## Reducing Branch Delay

- By moving hardware for determining branch outcome to the ID stage, including
  - Target address adder
  - Register comparator
  - Branch logic (the and gate)

## Change the Hardware

Register comparison, output should "and" with the branch



## Stall on Branch

 Insert only one bubble after moving the branch decision making to ID stage



This instruction might be the one following beq or the one at branch target depending on the comparison result

## **Resolutions to Branch Hazard**

- Stall on branch
- Always assume branch not taken
- Branch prediction

## **Assume Branch Not Taken**

- If we are right, lucky us!
- If we are wrong, penalty will be to flush one (previously, three) instructions

### Flush an Instruction

- If a branch should have been taken, then assumed wrong, then we need to flush the wrongly fetched instruction
- Flush: To discard the wrong instruction in pipeline, equivalent to neutralize all operations
  - Clear IF/ID pipeline register, by a new control signal IF.Flush
    - Flushes the instruction in IF stage





# Should we be concerned by the hardware change?

Timing Issue!

## **Potential Impact to Performance**



# Should we be concerned by the hardware change?

New data hazards!

 If a comparison register is a destination of 2<sup>nd</sup> preceding ALU instruction



Can resolve using new forwarding path

## **Forwarding Paths**

 Forwarding paths are created between stage pipeline register and comparator inputs

What are the conditions to determine Forward1 and Forward2?



If a comparison register is a destination of *immediately* preceding ALU instruction or  $2^{nd}$  preceding load instruction



Need 1 stall cycle even with forwarding



How to detect data hazard for branch? How to stall?

- If a comparison register is a destination of immediately preceding load instruction
  - Need 2 stall cycles



## Resolutions to Branch Hazard

- Stall on branch
- Always assume branch not taken
- Branch prediction (instead of assumption)

### **Branch Prediction**

- Static prediction
  - Based on typical branch behavior
  - Example: loop and if-statement branches
    - Could predict: backward branches always taken
    - Could predict: forward branches always not taken
- Dynamic prediction
  - Hardware measures actual branch behavior
    - e.g., record recent history of each branch in a table
  - Assume future behavior will continue the trend
    - If wrong, take penalty, and update history

## **Dynamic Branch Prediction**

- In deeper and superscalar pipelines, branch penalty is more significant
- Use dynamic prediction
  - Branch prediction buffer (aka branch history table)
  - Indexed by recent branch instruction addresses
  - Stores outcome (taken/not taken)
  - To execute a branch
    - Check table, expect the same outcome
    - Start fetching from fall-through or target
    - If wrong, flush pipeline and flip prediction

## **Branch Prediction Buffer**

| address    | instruction | prediction    |
|------------|-------------|---------------|
| 0x0008F000 | beq         | 1 (taken)     |
| 0x0008F0C4 | beq         | 0 (not taken) |
| 0x0008F310 | bne         | 0             |
| 0x00090000 | beq         | 1             |
|            | •           |               |
|            |             |               |
|            | •           | •             |
|            |             |               |

# 1-Bit Predictor: Shortcoming

Inner loop branches mispredicted twice!

```
outer: ...
inner: ...
beq ..., ..., inner
...
beq ..., ..., outer
```

- Mispredict as taken on last iteration of inner loop
- Then mispredict as not taken on first iteration of inner loop next time around

#### 2-Bit Predictor

 Only change prediction on two successive mispredictions



# **How About J-type?**

- How to implement?
- Is there a control hazard?